Code Compression for VLIW Processors

Authors

  • Yuan Xie
  • Haris Lekatsas
  • Wayne H. Wolf
Abstract

Code compression is an important issue in embedded system design because memory is one of the most constrained resources. Most previous work on code compression has targeted RISC architectures, even though VLIW processors have recently gained considerable popularity. In this research, we explore methods for compressing code for VLIW processors. Previous code compression algorithms for VLIW are dictionary-based schemes that target traditional VLIW architectures, which have rigid instruction word formats. We present several compression schemes, based on an arithmetic coding algorithm, for modern VLIW architectures, which use very flexible instruction formats to achieve high code density. The compression algorithm used in our research is a reduced-precision arithmetic coder combined with a Markov model [1].

We investigated several modern VLIW architectures: Texas Instruments' TMS320C6x, Lucent's StarCore SC140 DSP, and Intel/HP's IA-64. Modern VLIW ISAs adopt a VLES (variable-length execution set) approach to achieve high code density at minimal cost. The code contains two kinds of instruction packets: fetch packets and execute packets. A fetch packet consists of a fixed number of instructions that are always fetched together, while all instructions that execute in parallel constitute an execute packet.

Based on the characteristics of modern VLIW instruction formats, we propose two schemes to improve the compression ratio. The first increases the compression block size to the size of a fetch packet. The second is a multiple-model scheme, which works for VLIW ISAs that have more restricted instruction formats, such as IA-64; it constructs different models for different parts of the long instruction word.

Fast decompression is particularly important for VLIW machines.
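To make the distinction between fetch packets and execute packets concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a TMS320C6x-style encoding in which a fetch packet holds eight 32-bit instructions and the least significant bit of each instruction (the parallel bit) is set when the next instruction executes in parallel with it. The names `FETCH_PACKET_SIZE` and `split_execute_packets`, and the sample instruction words, are hypothetical.

```python
from typing import List

# Assumption: eight instructions per fetch packet, as on a C6x-style machine.
FETCH_PACKET_SIZE = 8


def split_execute_packets(fetch_packet: List[int]) -> List[List[int]]:
    """Group the instructions of one fetch packet into execute packets.

    Assumption: bit 0 of each instruction word is a parallel bit. When it is
    set, the next instruction belongs to the same execute packet.
    """
    if len(fetch_packet) != FETCH_PACKET_SIZE:
        raise ValueError("a fetch packet must hold exactly 8 instructions")

    packets: List[List[int]] = []
    current: List[int] = []
    for word in fetch_packet:
        current.append(word)
        if word & 1 == 0:
            # Parallel bit clear: this instruction closes the execute packet.
            packets.append(current)
            current = []
    if current:
        # The last word had its parallel bit set; keep the open group as-is.
        packets.append(current)
    return packets


if __name__ == "__main__":
    # Made-up instruction words; only bit 0 matters for this sketch.
    words = [0x11, 0x21, 0x30, 0x40, 0x51, 0x61, 0x71, 0x80]
    print([len(p) for p in split_execute_packets(words)])  # prints [3, 1, 4]
```

Under these assumptions, a compressor that uses the fetch packet as its block (the first scheme above) would encode all eight words together, regardless of where the execute-packet boundaries fall.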
Because instruction words are long, existing code compression algorithms introduce very long decompression delays into instruction execution. We therefore also explore the trade-offs between compression ratio and decompression speed. To enable parallel decompression of each fetch packet, we divide the fetch packet into sub-blocks and attach a tag in front of each compressed sub-block that indicates its size, so that the decompressor knows the location of the next compressed sub-block. Another scheme takes a vertical compression approach: all the fetch packets in the code are divided into several streams, and the streams are compressed in parallel. Since each stream has its own line address table (LAT), which records the position of each compressed block, we do not need to attach a tag to each compressed block. This is also a multiple-model approach, since each stream has its own statistical model. Both schemes sacrifice compression ratio for faster decompression.

Reference:

[1] Lekatsas and Wolf, "SAMC: A code compression algorithm for embedded processors", IEEE Transactions on CAD, December 1999.

Yuan Xie, Haris Lekatsas, Wayne Wolf

Electrical Engineering Department, Princeton University, Princeton, NJ 08540, {yuanxie,wolf}@ee.princeton.edu

NEC USA, 4 Independence Way, Princeton, NJ 08540, [email protected]
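The tagged sub-block scheme described in the abstract can be illustrated with a short Python sketch under stated assumptions. It is not the authors' coder: `zlib` stands in for the reduced-precision arithmetic coder, the tag is assumed to be a single byte holding the compressed length, and a 32-byte fetch packet is assumed to be split into four sub-blocks. All function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def compress_fetch_packet(packet: bytes, num_sub_blocks: int = 4) -> bytes:
    """Compress each sub-block separately and prefix it with a 1-byte size tag."""
    if not packet or len(packet) % num_sub_blocks != 0:
        raise ValueError("packet must be non-empty and split evenly")
    size = len(packet) // num_sub_blocks
    out = bytearray()
    for start in range(0, len(packet), size):
        # Stand-in compressor (assumption): zlib instead of arithmetic coding.
        body = zlib.compress(packet[start:start + size])
        if len(body) > 255:
            raise ValueError("compressed sub-block does not fit a 1-byte tag")
        out.append(len(body))  # tag: size of the compressed sub-block
        out += body
    return bytes(out)


def locate_sub_blocks(stream: bytes) -> List[bytes]:
    """Walk the tags to find where each compressed sub-block starts and ends."""
    blocks: List[bytes] = []
    pos = 0
    while pos < len(stream):
        length = stream[pos]
        end = pos + 1 + length
        if end > len(stream):
            raise ValueError("truncated compressed sub-block")
        blocks.append(stream[pos + 1:end])
        pos = end
    return blocks


def decompress_fetch_packet(stream: bytes) -> bytes:
    """Decompress the sub-blocks independently, here with a thread pool."""
    blocks = locate_sub_blocks(stream)
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(zlib.decompress, blocks))
    return b"".join(parts)


if __name__ == "__main__":
    packet = bytes(range(32))  # made-up 32-byte fetch packet
    stream = compress_fetch_packet(packet)
    print(decompress_fetch_packet(stream) == packet)  # prints True
```

In this sketch only the tag walk is sequential; once the sub-block boundaries are known, each sub-block can be decoded on its own. The per-block tags and the smaller blocks cost compression ratio, which is the trade-off the abstract describes.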

Similar resources

Improving Dictionary-Based Code Compression in VLIW Architectures

Reducing code size is crucial in embedded systems as well as in high-performance systems to overcome the communication bottleneck between memory and CPU, especially with VLIW (Very Long Instruction Word) processors that require a high-bandwidth instruction prefetching. This paper presents a new approach for dictionary-based code compression in VLIW processor-based systems using isomorphism amon...

Dynamic Scheduling Techniques for VLIW Processors

Keywords: instruction-level parallelism, VLIW processors, superscalar processors, pipelining, multiple operation issue, scoreboarding, dynamic scheduling, out-of-order execution. VLIW processors are viewed as an attractive way of achieving instruction-level parallelism because of their ability to issue multiple operations per cycle with relatively simple control logic. They are also perceived as being of...

Co-design of Compiler and Hardware Techniques to Reduce Program Code Size on a VLIW Processor

Code size is a primary concern in the embedded computing community. Minimizing physical memory requirements reduces total system cost and improves performance and power efficiency. VLIW processors rely on the compiler to statically encode the ILP in the program before its execution, and because of this, code size is larger relative to other processors. In this paper we describe the co-design of...

An Automatic System for Application-Specific Instruction Format Design and Code Generation for VLIW and EPIC processors

Introduction. Whereas the workstation and personal computer markets are rapidly converging on a small number of similar architectures, the embedded systems market is enjoying an explosion of architectural diversity. This diversity is driven by demands for higher performance at a lower cost and power consumption, and is propelled by the possibility of designing application-specific instruction-s...

Machine-Description Driven Compilers for EPIC and VLIW Processors

In the past, due to the restricted gate count available on an inexpensive chip, embedded DSPs have had limited parallelism, few registers and irregular, incomplete interconnectivity. More recently, with increasing levels of integration, embedded VLIW processors have started to appear. Such processors typically have higher levels of instruction-level parallelism, more registers, and a relatively...

Publication date: 2001